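Pre-trained models have been shown to be effective in many code intelligence tasks. These models are pre-trained on large-scale unlabeled corpora and then fine-tuned on downstream tasks. However, because the inputs to pre-training and to downstream tasks take different forms, it is hard to fully exploit the knowledge of pre-trained models. In addition, the performance of fine-tuning depends strongly on the amount of downstream data, while in practice scenarios with scarce data are common. Recent studies in natural language processing (NLP) show that prompt tuning, a new tuning paradigm, alleviates these issues and achieves promising results on various NLP tasks. In prompt tuning, the prompts inserted during tuning provide task-specific knowledge, which is especially beneficial for tasks with relatively little data. In this paper, we empirically evaluate the usage and effect of prompt tuning in code intelligence tasks. We apply prompt tuning to the popular pre-trained models CodeBERT and CodeT5 and experiment with three code intelligence tasks: defect prediction, code summarization, and code translation. Our experimental results show that prompt tuning consistently outperforms fine-tuning on all three tasks. Prompt tuning also shows great potential in low-resource scenarios; for example, for code summarization it improves the BLEU scores of fine-tuning by more than 26% on average. Our results suggest that prompt tuning can be adapted to code intelligence tasks to achieve better performance, especially when task-specific data is lacking.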
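To make the idea of an inserted prompt concrete, the following is a minimal sketch of a cloze-style (hard) prompt for defect prediction. The template wording, the `[MASK]` slot, the label words, and the function names are illustrative assumptions and are not taken from the paper; the mask scores are hard-coded stand-ins for what a pre-trained model would output.

```python
from typing import Dict

# Hypothetical hard-prompt sketch: template, label words, and scores are assumptions.
MASK = "[MASK]"

# Verbalizer: maps each class label to the word expected at the mask slot.
VERBALIZER: Dict[str, str] = {"defective": "bad", "clean": "good"}


def build_prompt(code: str) -> str:
    """Wrap a code snippet in a cloze-style template with one mask slot."""
    return f"The code {code} is {MASK}."


def predict_label(mask_scores: Dict[str, float]) -> str:
    """Pick the label whose label word scores highest at the mask slot."""
    return max(
        VERBALIZER,
        key=lambda label: mask_scores.get(VERBALIZER[label], float("-inf")),
    )


prompt = build_prompt("int f(int *p) { return *p; }")
# Stand-in for the model's scores over candidate words at the mask position.
label = predict_label({"bad": 0.7, "good": 0.3})
```

The point of the template is that the downstream input now resembles the masked-token input seen during pre-training, so the classification decision reduces to comparing scores of a few label words.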
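Historically, patient datasets have been used to develop and validate various reconstruction algorithms for PET/MRI and PET/CT. To enable such algorithm development without the need to acquire hundreds of patient exams, in this paper we demonstrate a deep learning technique that generates synthetic but realistic whole-body PET sinograms from abundantly available whole-body MRI. Specifically, we use a dataset of 56 $^{18}$F-FDG-PET/MRI exams to train a 3D residual UNet to predict physiologic PET uptake from whole-body T1-weighted MRI. In training, we implemented a balanced loss function to generate realistic uptake across a large dynamic range, and computed losses along tomographic lines of response to mimic the PET acquisition. The predicted PET images are forward projected to produce synthetic PET time-of-flight (ToF) sinograms that can be used with vendor-provided PET reconstruction algorithms, including those using CT-based attenuation correction (CTAC) and MR-based attenuation correction (MRAC). The resulting synthetic data recapitulates physiologic $^{18}$F-FDG uptake, for example high uptake localized to the brain and bladder, as well as uptake in the liver, kidneys, heart, and muscle. To simulate abnormalities with high uptake, we also insert synthetic lesions. We demonstrate that this synthetic PET data can be used interchangeably with real PET data for the PET quantification task of comparing CT-based and MR-based attenuation correction methods, achieving $\leq 7.6\%$ error in mean values compared with using real data. Together, these results show that the proposed synthetic PET data pipeline can reasonably be used for the development, evaluation, and validation of PET/MRI reconstruction methods.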
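Both history and future contextual information are known to be important for accurate acoustic modeling. However, acquiring future context introduces latency in streaming ASR. In this paper, we propose a new framework, Chunking, Simulating Future Context and Decoding (CUSIDE), for streaming speech recognition. A new simulation module is introduced to recursively simulate the future contextual frames, without waiting for the real future context. The simulation module is trained jointly with the ASR model using a self-supervised loss; the ASR model is optimized with the usual ASR loss (for example, CTC-CRF, as used in our experiments). Experiments show that, compared with using real future frames as right context, using simulated future context can drastically reduce latency while maintaining recognition accuracy. With CUSIDE, we obtain new state-of-the-art streaming ASR results on the AISHELL-1 dataset.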
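The chunk-plus-simulated-context idea can be sketched as follows. This is only an illustration under stated assumptions: the chunk size, context lengths, and function names are hypothetical, and `repeat_last_frame` is a trivial placeholder standing in for the learned, recursive simulation module, which the abstract does not specify.

```python
from typing import Callable, List, Sequence, Tuple

Frame = List[float]
Piece = Tuple[List[Frame], List[Frame], List[Frame]]


def repeat_last_frame(chunk: Sequence[Frame], n_future: int) -> List[Frame]:
    """Placeholder simulator: repeats the chunk's last frame (NOT the learned module)."""
    return [list(chunk[-1]) for _ in range(n_future)]


def chunk_with_simulated_context(
    frames: Sequence[Frame],
    chunk_size: int,
    n_history: int,
    n_future: int,
    simulate: Callable[[Sequence[Frame], int], List[Frame]] = repeat_last_frame,
) -> List[Piece]:
    """Split frames into chunks; attach real history and simulated future frames."""
    pieces: List[Piece] = []
    for start in range(0, len(frames), chunk_size):
        chunk = [list(f) for f in frames[start:start + chunk_size]]
        # Left context comes from frames that have already arrived.
        history = [list(f) for f in frames[max(0, start - n_history):start]]
        # Right context is simulated, so decoding need not wait for real frames.
        future = simulate(chunk, n_future)
        pieces.append((history, chunk, future))
    return pieces


frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
pieces = chunk_with_simulated_context(frames, chunk_size=2, n_history=1, n_future=2)
```

Each tuple holds the left context, the current chunk, and the simulated right context; the latency saving comes from the fact that the right context is produced from the current chunk rather than awaited.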
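Compared with conventional rigid robots, soft robots have attracted increasing attention owing to their advantages in compliance, safety, and low cost. As an essential part of soft robots, the soft robotic gripper also shows its superiority when grasping objects with irregular shapes. Recent research has sought to improve grasping performance by adjusting the variable effective length (VEL). However, VEL achieved through multi-chamber designs or tunable-stiffness shape memory materials requires either a complex pneumatic circuit design or a time-consuming phase-change process. This work proposes a fold-based soft robotic actuator made from a 3D-printed filament, NinjaFlex. The material is experimentally tested and represented by a hyperelastic model. Mathematical and finite element modeling is conducted to study the bending behavior of the proposed soft actuator. In addition, an antagonistic constraint mechanism is proposed to achieve VEL, and experiments demonstrate that better conformity is achieved. Finally, a two-mode gripper is designed to demonstrate the improvement that VEL brings to grasping performance.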
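Deep neural networks for medical image reconstruction are traditionally trained using high-quality ground-truth images as training targets. Recent work on Noise2Noise (N2N) has shown the potential of using multiple noisy measurements as an alternative to having a ground truth. However, existing N2N-based methods are not suitable for learning from measurements of an object undergoing nonrigid deformation. This paper addresses this issue by proposing the deformation-compensated learning (DecoLearn) method, which trains deep reconstruction networks by compensating for object deformations. A key component of DecoLearn is a deep registration module, which is trained jointly with the deep reconstruction network without any ground-truth supervision. We validate DecoLearn on both simulated and experimentally collected magnetic resonance imaging (MRI) data and show that it significantly improves imaging quality.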
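Recently, deep learning techniques have been widely used in the field of image recognition. However, their main applications are the recognition and detection of ordinary pictures and common scenes. It is challenging to analyze, effectively and efficiently, the remote-sensing images obtained by image acquisition systems on unmanned aerial vehicles (UAVs), a task that includes identifying the target and calculating its position. Compared with ordinary pictures or images, aerial remote-sensing images are captured with different shooting angles and methods, which gives them an irreplaceable role in some areas. In this study, a new target detection and recognition method is proposed, based on a deep convolutional neural network (CNN) that provides multilevel information about the image, combined with a region proposal network that generates multiple kinds of regions of interest. The proposed method produced results that were much more accurate and precise than those obtained with traditional approaches. This indicates that the proposed model has great potential for application in remote-sensing image recognition.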
Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications benefit only marginally, or not at all, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distillation targets, losses, input, network regularization, sequential distillation, etc., revealing that: 1) distilling token relations is more effective than CLS token- and feature-based distillation; 2) using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student does not match that of the teacher; 3) weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over MIM pre-training from scratch on ImageNet-1K classification, with gains of +4.2%/+2.4%/+1.4% for the ViT-Tiny, ViT-Small, and ViT-Base models, respectively. Our TinyMIM model of base size achieves 52.2 mIoU in ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way of developing small vision Transformer models, namely by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
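As a rough illustration of what distilling token relations can look like, the sketch below compares token-to-token relation matrices of a teacher and a student with a KL divergence. It is an assumption-laden sketch rather than the TinyMIM implementation: the relations here are computed from raw token features with a scaled dot product and softmax, and all function names are hypothetical; the abstract does not say which projections or loss the authors use.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def token_relations(features: Matrix) -> Matrix:
    """Row-wise softmax of scaled dot products between every pair of tokens."""
    scale = 1.0 / math.sqrt(len(features[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(ti, tj)) for tj in features])
        for ti in features
    ]


def relation_distillation_loss(teacher: Matrix, student: Matrix) -> float:
    """Mean over tokens of KL(teacher relations || student relations)."""
    t_rel = token_relations(teacher)
    s_rel = token_relations(student)
    kl_rows = [
        sum(p * (math.log(p) - math.log(max(q, 1e-12)))
            for p, q in zip(t_row, s_row) if p > 0.0)
        for t_row, s_row in zip(t_rel, s_rel)
    ]
    return sum(kl_rows) / len(kl_rows)


# Two tokens each; teacher width 3, student width 2 (assumed toy values).
teacher_tokens = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
student_tokens = [[1.0, 0.0], [0.0, 1.0]]
loss = relation_distillation_loss(teacher_tokens, student_tokens)
```

One reason relation targets are convenient is visible in the shapes: each relation matrix is tokens by tokens, so the teacher and student may have different feature widths without needing an extra projection layer.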
Existing automated techniques for software documentation typically attempt to reason between two main sources of information: code and natural language. However, this reasoning process is often complicated by the lexical gap between more abstract natural language and more structured programming languages. One potential bridge for this gap is the Graphical User Interface (GUI), as GUIs inherently encode salient information about underlying program functionality into rich, pixel-based data representations. This paper offers one of the first comprehensive empirical investigations into the connection between GUIs and functional, natural language descriptions of software. First, we collect, analyze, and open-source a large dataset of functional GUI descriptions consisting of 45,998 descriptions for 10,204 screenshots from popular Android applications. The descriptions were obtained from human labelers and underwent several quality control mechanisms. To gain insight into the representational potential of GUIs, we investigate the ability of four Neural Image Captioning models to predict natural language descriptions of varying granularity when provided with a screenshot as input. We evaluate these models quantitatively, using common machine translation metrics, and qualitatively, through a large-scale user study. Finally, we offer lessons learned and a discussion of the potential shown by multimodal models to enhance future techniques for automated software documentation.
Text clustering and topic extraction are two important tasks in text mining. Usually, these two tasks are performed separately. For topic extraction to facilitate clustering, we can first project texts into a topic space and then run a clustering algorithm to obtain clusters. For clustering to promote topic extraction, we can first obtain clusters with a clustering algorithm and then extract cluster-specific topics. However, this naive strategy ignores the fact that text clustering and topic extraction are strongly correlated and follow a chicken-and-egg relationship. Performing them separately prevents them from benefiting each other and from achieving the best overall performance. In this paper, we propose an unsupervised text clustering and topic extraction framework (ClusTop) that integrates text clustering and topic extraction into a unified framework and can simultaneously achieve high-quality clustering results and extract topics from each cluster. Our framework includes four components: enhanced language model training, dimensionality reduction, clustering, and topic extraction, where the enhanced language model can be viewed as a bridge between clustering and topic extraction. On the one hand, it provides text embeddings with a strong cluster structure, which facilitates effective text clustering; on the other hand, because of its self-attention architecture, it pays close attention to topic-related words for topic extraction. Moreover, the training of the enhanced language model is unsupervised. Experiments on two datasets demonstrate the effectiveness of our framework and provide benchmarks for different model combinations in this framework.
Cognitive Computing (COC) aims to build highly cognitive machines with low computational resources that respond in real time. However, the scholarly literature shows varying research areas and various interpretations of COC. This calls for a cohesive architecture that delineates the nature of COC. We argue that if Herbert Simon considered design science to be the science of the artificial, then cognitive systems are the products of cognitive science, or 'the newest science of the artificial'. Therefore, building a conceptual basis for COC is an essential step toward prospective cognitive computing-based systems. This paper proposes an architecture of COC by analyzing the literature on COC using a range of statistical analysis methods. We then compare the statistical analysis results with previous qualitative analysis results to confirm our findings. The study also comprehensively surveys recent research on COC to identify the state of the art and connect the advances in varied research disciplines in COC. The study found that there are three underlying computing paradigms (von Neumann, neuromorphic engineering, and quantum computing) that comprehensively complement the structure of cognitive computation. The study discusses possible applications and open research directions under the COC umbrella.